New Grid-Based Algorithms for Partially Observable Markov Decision Processes: Theory and Practice
Abstract
We present two new algorithms for Partially Observable Markov Decision Processes (POMDPs). The first is a general grid-based algorithm for POMDPs with theoretical optimality guarantees. The second targets the subclass of problems known as Stochastic Shortest-Path problems in belief space. Both algorithms are optimal and robust with respect to a novel robustness criterion that is also introduced in the paper. On the practical side, we test the approach on a number of diverse problems.
Similar resources
An Improved Grid-Based Approximation Algorithm for POMDPs
Although a partially observable Markov decision process (POMDP) provides an appealing model for problems of planning under uncertainty, exact algorithms for POMDPs are intractable. This motivates work on approximation algorithms, and grid-based approximation is a widely-used approach. We describe a novel approach to grid-based approximation that uses a variable-resolution regular grid, and show...
A POMDP Framework to Find Optimal Inspection and Maintenance Policies via Availability and Profit Maximization for Manufacturing Systems
Maintenance can either increase or decrease a system's availability, so it is valuable to evaluate a maintenance policy from the cost and availability points of view simultaneously, according to the decision maker's priorities. This study proposes a Partially Observable Markov Decision Process (POMDP) framework for a partially observable and stochastically deteriorating syste...
Accelerated decomposition techniques for large discounted Markov decision processes
Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...
An ε-Optimal Grid-Based Algorithm for Partially Observable Markov Decision Processes
We present an ε-optimal grid-based algorithm for POMDPs that is tractable in ε, the discount factor, and the maximum absolute value of the cost function, but exponential in the dimension of the state space. To the best of our knowledge, this is the first optimal grid-based algorithm for POMDPs: all other optimal algorithms that we know of are based on Sondik's representation of the value function. ...
Training a real-world POMDP-based Dialogue System
Partially Observable Markov Decision Processes provide a principled way to model uncertainty in dialogues. However, traditional algorithms for optimising policies are intractable except for cases with very few states. This paper discusses a new approach to policy optimisation based on grid-based Q-learning with a summary of belief space. We also present a technique for bootstrapping the system ...